Talking heads - communication, articulation and animation
Abstract
Human speech communication relies not only on audition but also on vision, especially in poor acoustic conditions. The face is an important carrier of both linguistic and extra-linguistic information. Using computer graphics, it is possible to synthesize faces and perform audio-visual text-to-speech synthesis, a technique with a number of interesting applications, for example in man-machine interfaces. At KTH, a system for rule-based audio-visual text-to-speech synthesis has been developed. The system is based on the KTH text-to-speech system, which has been complemented with a three-dimensional parametric model of a human face that is animated in real time in synchrony with the auditory speech. The audio-visual text-to-speech synthesis has also been incorporated into a system for spoken man-machine dialogue.
Similar resources
Compression of MPEG-4 facial animation parameters for transmission of talking heads
The emerging MPEG-4 standard supports the transmission and composition of facial animation with natural video. The new standard will include a facial animation parameter (FAP) set that is defined based on the study of minimal facial actions and is closely related to muscle actions. The FAP set enables model-based representation of natural or synthetic talking-head sequences and allows intelligi...
Lifelike Talking Faces for Interactive Services
Lifelike talking faces for interactive services are an exciting new modality for man–machine interactions. Recent developments in speech synthesis and computer animation enable the real-time synthesis of faces that look and behave like real people, opening opportunities to make interactions with computers more like face-to-face conversations. This paper focuses on the technologies for creating ...
Real-time streaming for the animation of talking faces in multiuser environments
In order to enable face animation on the Internet using high quality synthetic speech, the Text-to-Speech (TTS) servers need to be implemented on network-based servers and shared by many users. The output of a TTS server is used to animate talking heads as defined in MPEG-4. The TTS server creates two sets of data: audio data and Phonemes with optional Facial Animation Parameters (FAP) like smi...
Audio-visual quality as combination of unimodal qualities: environmental effects on talking heads
Talking heads provide a multimodal output component for human-computer interfaces. They consist of facial visual models that are synchronized with speech synthesis modules with respect to speech articulation. Because they are reduced to a human head or upper body, the head can be displayed larger, so articulation is often more clearly visible than with a full human body. There...
A Tool for Simplified Creation of MPEG4-Based Virtual Talking Heads
The aim of this master’s thesis was to develop a tool for easier creation of animated talking heads, using the latest animation platform developed at the Department of Speech, Music and Hearing at KTH. This animation platform is based on the new international MPEG4 standard for face and body animation, and it allows arbitrary static face models to be animated and used for speech synthesis. To...